Method and apparatus for performing prediction using template-based weight

ABSTRACT

The present invention relates to a method for performing bi-prediction using a template-based weight, comprising the steps of: determining a template region for performing bi-prediction of a current block; calculating an uncorrelation factor on the basis of the template region, wherein the uncorrelation factor denotes a value indicating the uncorrelation between a template of the current block and a template of a reference block; determining a first weight parameter of the current block on the basis of the uncorrelation factor; and obtaining a first prediction value of the current block using the first weight parameter.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is the National Stage filing under 35 U.S.C. 371 of International Application No. PCT/KR2017/010503, filed on Sep. 22, 2017, which claims the benefit of U.S. Provisional Application No. 62/398,523, filed on Sep. 23, 2016, the contents of which are all hereby incorporated by reference herein in their entirety.

TECHNICAL FIELD

The present invention relates to a method and apparatus for encoding/decoding a video signal and, more particularly, to a method and apparatus for determining an optimal prediction weight using a template and performing inter prediction.

BACKGROUND ART

Compression encoding means a series of signal processing techniques for transmitting digitized information through a communication line or for storing digitized information in a form appropriate to a storage medium. Media such as video, images, and voice may be targets of compression encoding; in particular, technology that performs compression encoding targeting video is referred to as video compression.

Next-generation video content will have the characteristics of high spatial resolution, a high frame rate, and high dimensionality of scene representation. Processing such content will entail a remarkable increase in memory storage, memory access rate, and processing power.

Therefore, it is necessary to design a coding tool for processing next-generation video content more efficiently.

DISCLOSURE

Technical Problem

The present invention is to propose a method of encoding/decoding a video signal more efficiently.

Furthermore, the present invention is to propose a method of determining an optimal prediction method among various prediction methods.

Furthermore, the present invention is to propose a method of explicitly transmitting a weight index.

Furthermore, the present invention is to propose a method of determining a template region.

Furthermore, the present invention is to propose a method of determining an uncorrelation factor based on a template region.

Furthermore, the present invention is to propose a method of obtaining an optimal predictor using an adaptive weight parameter in a generalized bi-prediction.

Technical Solution

In order to accomplish the above objects,

the present invention provides a method of determining an optimal prediction weight using a template.

Furthermore, the present invention provides a method of obtaining an optimal predictor using an adaptive weight parameter in a generalized bi-prediction.

Furthermore, the present invention provides a method of obtaining an optimal predictor by explicitly transmitting a weight index.

Furthermore, the present invention provides a method of determining a template region based on neighboring pixels of an L0 reference block, an L1 reference block and a current block.

Furthermore, the present invention provides a method of determining an uncorrelation factor indicating an uncorrelation between an L0 reference block and a current block and an uncorrelation between an L1 reference block and the current block, based on the templates of the L0 reference block, the L1 reference block and the current block.

Furthermore, the present invention provides a method of adaptively determining a weight parameter using an uncorrelation factor.

Advantageous Effects

The present invention can obtain a further improved predictor using an adaptive weight parameter in a generalized bi-prediction.

Furthermore, the present invention can obtain a further improved predictor and improve the encoding efficiency of an image by adaptively determining a weight parameter for an L0 predictor and an L1 predictor.

DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram illustrating a configuration of an encoder for encoding a video signal according to an embodiment of the present invention.

FIG. 2 is a block diagram illustrating a configuration of a decoder for decoding a video signal according to an embodiment of the present invention.

FIG. 3 is a diagram illustrating a split structure of a coding unit according to an embodiment of the present invention.

FIG. 4 is an embodiment to which the present invention is applied and is a diagram for illustrating a prediction unit.

FIG. 5 is an embodiment to which the present invention is applied and is a diagram for illustrating a quadtree binarytree (hereinafter referred to as a “QTBT”) block split structure.

FIG. 6 is an embodiment to which the present invention is applied and is a diagram for illustrating a uni-directional inter prediction and a bi-directional inter prediction.

FIG. 7 is an embodiment to which the present invention is applied and is a diagram for illustrating a generalized bi-prediction method.

FIG. 8 is an embodiment to which the present invention is applied and is a flowchart for illustrating a process of determining an optimal prediction method.

FIG. 9 is an embodiment to which the present invention is applied and is a flowchart for illustrating a process of performing an optimal bi-directional prediction using a weight index.

FIG. 10 is an embodiment to which the present invention is applied and shows a syntax structure in which an optimal bi-directional prediction is performed using a weight index.

FIG. 11 is an embodiment to which the present invention is applied and is a diagram for illustrating a method of performing an adaptive bi-directional prediction using a template.

FIG. 12 is an embodiment to which the present invention is applied and is a flowchart for illustrating a process of performing an adaptive bi-directional prediction using a template.

FIG. 13 is an embodiment to which the present invention is applied and is a diagram illustrating a method of determining a template region in order to perform an adaptive bi-directional prediction.

FIG. 14 is an embodiment to which the present invention is applied and is a flowchart for illustrating a process of performing an adaptive bi-directional prediction based on a template without using a weight index.

FIG. 15 is an embodiment to which the present invention is applied and is a flowchart for illustrating a process of performing an optimal bi-directional prediction based on a method of determining two or more weights.

BEST MODE

The present invention provides a method of performing a bi-directional prediction using a template-based weight, including: determining a template region for performing the bi-directional prediction on a current block; calculating an uncorrelation factor based on the template region, wherein the uncorrelation factor means a value indicating an uncorrelation between a template of the current block and a template of a reference block; determining a first weight parameter of the current block based on the uncorrelation factor; and obtaining a first predictor of the current block using the first weight parameter.

In the present invention, the template region indicates a set of L line pixels (L=1˜(N−1)) neighboring at least one of the left side, the upper side and the upper-left vertex of the current block or a reference block having an N×N size, or indicates a set of L line pixels (L=1˜(N−1)) neighboring at least one of the left side, the right side, the upper side, the lower side and the four vertexes.

In the present invention, the uncorrelation factor is derived as the sum of the absolute values of pixel value differences between the template of the current block and the template of the reference block. The uncorrelation factor includes an L0 uncorrelation factor for an L0 reference block and an L1 uncorrelation factor for an L1 reference block.

In the present invention, the first weight parameter of the current block is determined by the equation ((the L1 uncorrelation factor)/(the L0 uncorrelation factor + the L1 uncorrelation factor)).
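As a numeric illustration (the values here are hypothetical, not taken from the specification): if the L0 uncorrelation factor is 30 and the L1 uncorrelation factor is 10, the first weight parameter is 10/(30+10)=¼ and its complementary weight is ¾. Assuming, for illustration, that this parameter weights the L0 predictor, the better-matched L1 reference receives the complementary weight ¾; that is, the reference whose template differs less from the template of the current block can be weighted more heavily.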

In the present invention, the method further includes determining a second weight parameter of the current block as a value of ½, obtaining a second predictor using the second weight parameter, and determining an optimal predictor of the current block based on the rate-distortion costs of the first predictor and the second predictor.

In the present invention, the method further includes signaling an adaptive weight index corresponding to each weight parameter. The first weight parameter and the second weight parameter are indicated by the adaptive weight index.

In the present invention, the second weight parameter is determined based on a predetermined weight table.

The present invention provides an apparatus for performing a bi-directional prediction on a current block using a template-based weight, including: an inter prediction unit configured to determine a template region for performing the bi-directional prediction on the current block, calculate an uncorrelation factor based on the template region, determine a first weight parameter of the current block based on the uncorrelation factor, and obtain a first predictor of the current block using the first weight parameter; and a reconstruction unit configured to reconstruct the current block using the first predictor. The uncorrelation factor means a value indicating an uncorrelation between a template of the current block and a template of a reference block.

In the present invention, the inter prediction unit is configured to determine a second weight parameter of the current block as a value of ½, obtain a second predictor using the second weight parameter, and determine an optimal predictor of the current block based on the rate-distortion costs of the first predictor and the second predictor.

In the present invention, the apparatus further includes a parsing unit configured to extract, from a video signal, an adaptive weight index corresponding to each weight parameter. The first weight parameter and the second weight parameter are indicated by the adaptive weight index.

MODE FOR INVENTION

Hereinafter, a configuration and operation of an embodiment of the present invention will be described in detail with reference to the accompanying drawings. The configuration and operation of the present invention described with reference to the drawings are described as an embodiment, and the scope, core configuration, and operation of the present invention are not limited thereto.

Further, the terms used in the present invention are selected from currently and widely used general terms, but in specific cases, terms arbitrarily selected by the applicant are used. In such cases, because the meaning of the term is clearly described in the detailed description of the corresponding portion, the term should not be construed simply by the name used in the description of the present invention; rather, the meaning of the corresponding term should be comprehended and construed.

Further, when there is a general term selected for describing the invention or another term having a similar meaning, the terms used in the present invention may be replaced for more appropriate interpretation. For example, in each coding process, a signal, data, a sample, a picture, a frame, and a block may be appropriately replaced and construed. Further, in each coding process, partitioning, decomposition, splitting, and division may be appropriately replaced and construed.

FIG. 1 shows a schematic block diagram of an encoder for encoding a video signal, in accordance with one embodiment of the present invention.

Referring to FIG. 1, an encoder 100 may include an image segmentation unit 110, a transform unit 120, a quantization unit 130, an inverse quantization unit 140, an inverse transform unit 150, a filtering unit 160, a DPB (Decoded Picture Buffer) 170, an inter-prediction unit 180, an intra-prediction unit 185 and an entropy-encoding unit 190.

The image segmentation unit 110 may divide an input image (or a picture, a frame) input to the encoder 100 into one or more process units. For example, the process unit may be a coding tree unit (CTU), a coding unit (CU), a prediction unit (PU), or a transform unit (TU).

However, these terms are used only for convenience of illustration of the present disclosure; the present invention is not limited to the definitions of the terms. In this specification, for convenience of illustration, the term “coding unit” is employed as a unit used in a process of encoding or decoding a video signal; however, the present invention is not limited thereto, and another process unit may be appropriately selected based on the contents of the present disclosure.

The encoder 100 may generate a residual signal by subtracting a prediction signal output from the inter-prediction unit 180 or the intra-prediction unit 185 from the input image signal. The generated residual signal may be transmitted to the transform unit 120.

The transform unit 120 may apply a transform technique to the residual signal to produce a transform coefficient. The transform process may be applied to a square pixel block of the same size, or to a block of a variable size other than a square.

The quantization unit 130 may quantize the transform coefficient and transmit the quantized coefficient to the entropy-encoding unit 190. The entropy-encoding unit 190 may entropy-code the quantized signal and then output the entropy-coded signal as bit streams.

The quantized signal output from the quantization unit 130 may be used to generate a prediction signal. For example, the quantized signal may be subjected to an inverse quantization and an inverse transform via the inverse quantization unit 140 and the inverse transform unit 150 in the loop, respectively, to reconstruct a residual signal. The reconstructed residual signal may be added to the prediction signal output from the inter-prediction unit 180 or intra-prediction unit 185 to generate a reconstructed signal.

Meanwhile, in the compression process, adjacent blocks may be quantized by different quantization parameters, so that deterioration of the block boundary may occur. This phenomenon is called blocking artifacts, and it is one of the important factors in evaluating image quality. A filtering process may be performed to reduce such deterioration. Using the filtering process, the blocking deterioration may be eliminated and, at the same time, an error of a current picture may be reduced, thereby improving the image quality.

The filtering unit 160 may apply filtering to the reconstructed signal and then output the filtered reconstructed signal to a reproducing device or the decoded picture buffer 170. The filtered signal transmitted to the decoded picture buffer 170 may be used as a reference picture in the inter-prediction unit 180. In this way, using the filtered picture as the reference picture in the inter-picture prediction mode, not only the picture quality but also the coding efficiency may be improved.

The decoded picture buffer 170 may store the filtered picture for use as the reference picture in the inter-prediction unit 180.

The inter-prediction unit 180 may perform temporal prediction and/or spatial prediction with reference to the reconstructed picture to remove temporal redundancy and/or spatial redundancy. In this case, the reference picture used for the prediction may be a transformed signal obtained via quantization and inverse quantization on a block basis in the previous encoding/decoding. Thus, this may result in blocking artifacts or ringing artifacts.

Accordingly, in order to solve the performance degradation due to the discontinuity or quantization of the signal, the inter-prediction unit 180 may interpolate signals between pixels on a subpixel basis using a low-pass filter. In this case, a subpixel means a virtual pixel generated by applying an interpolation filter, and an integer pixel means an actual pixel existing in the reconstructed picture. The interpolation method may include linear interpolation, bi-linear interpolation, a Wiener filter, etc.

The interpolation filter is applied to a reconstructed picture, and thus can improve the precision of a prediction. For example, the inter-prediction unit 180 may generate an interpolated pixel by applying the interpolation filter to an integer pixel, and may perform a prediction using an interpolated block configured with interpolated pixels as a prediction block.
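As an illustration of the interpolation step, the following is a minimal sketch of bi-linear interpolation at a fractional-pel position; it is not the codec's actual interpolation filter (real codecs typically use longer low-pass filters), and the function name and array layout are assumptions.

    import numpy as np

    def bilinear_subpel(ref, y, x, fy, fx):
        # Bi-linearly interpolate a subpixel sample from the integer grid `ref`.
        # (y, x) is an integer-pixel position; (fy, fx) are fractional offsets
        # in [0, 1), e.g. 0.25 for a quarter-pel position.
        a = ref[y, x]
        b = ref[y, x + 1]
        c = ref[y + 1, x]
        d = ref[y + 1, x + 1]
        # weight the four surrounding integer pixels by overlap area
        return ((1 - fy) * (1 - fx) * a + (1 - fy) * fx * b
                + fy * (1 - fx) * c + fy * fx * d)

    ref = np.arange(16.0).reshape(4, 4)           # toy 4x4 reference area
    print(bilinear_subpel(ref, 1, 1, 0.25, 0.5))  # prints 6.5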

In an embodiment of the present invention, the inter prediction unit 180 may determine an optimal prediction weight using a template.

Furthermore, the inter prediction unit 180 may obtain an optimal predictor using an adaptive weight parameter in a generalized bi-prediction.

Furthermore, the inter prediction unit 180 may obtain an optimal predictor by explicitly determining a weight index.

Furthermore, the inter prediction unit 180 may determine a template region using neighboring pixels of an L0 reference block, an L1 reference block and a current block.

Furthermore, the inter prediction unit 180 may determine an uncorrelation factor indicating an uncorrelation between an L0 reference block and a current block and an uncorrelation between an L1 reference block and the current block, based on the templates of the L0 reference block, the L1 reference block and the current block.

Furthermore, the inter prediction unit 180 may adaptively determine a weight parameter using an uncorrelation factor.

The intra prediction unit 185 may predict a current block with reference to samples neighboring a block to be encoded. The intra prediction unit 185 may perform the following process in order to perform intra prediction. First, the prediction unit may prepare a reference sample necessary to generate a prediction signal. Furthermore, the prediction unit may generate a prediction signal using the prepared reference sample. Thereafter, the prediction unit encodes a prediction mode. In this case, the reference sample may be prepared through reference sample padding and/or reference sample filtering. The reference sample may include a quantization error because a prediction and reconstruction process has been performed on it. Accordingly, in order to reduce such an error, a reference sample filtering process may be performed for each prediction mode used for intra prediction.

The prediction signal generated through the inter prediction unit 180 or the intra prediction unit 185 may be used to generate a reconstructed signal or may be used to generate a residual signal.

FIG. 2 is an embodiment to which the present invention is applied and shows a schematic block diagram of a decoder by which the decoding of a video signal is performed.

Referring to FIG. 2, the decoder 200 may be configured to include a parsing unit (not shown), an entropy decoding unit 210, a dequantization unit 220, an inverse transform unit 230, a filtering unit 240, a decoded picture buffer (DPB) unit 250, an inter prediction unit 260, an intra prediction unit 265 and a reconstruction unit (not shown).

For another example, the decoder 200 may be simply represented as including a parsing unit (not shown), a block split determination unit (not shown) and a decoding unit (not shown). In this case, embodiments to which the present invention is applied may be performed through the parsing unit (not shown), the block split determination unit (not shown) and the decoding unit (not shown).

The decoder 200 may receive a signal output by the encoder 100 of FIG. 1, and may parse or obtain a syntax element through the parsing unit (not shown). The parsed or obtained signal may be entropy-decoded through the entropy decoding unit 210.

The dequantization unit 220 obtains a transform coefficient from the entropy-decoded signal using quantization step size information.

The inverse transform unit 230 obtains a residual signal by inversely transforming the transform coefficient.

The reconstruction unit (not shown) generates a reconstructed signal by adding the obtained residual signal to a prediction signal output by the inter prediction unit 260 or the intra prediction unit 265.

The filtering unit 240 applies filtering to the reconstructed signal and transmits the filtered signal to a playback device or transmits the filtered signal to the decoded picture buffer unit 250. The filtered signal transmitted to the decoded picture buffer unit 250 may be used as a reference picture in the inter prediction unit 260.

In this specification, the embodiments described for the filtering unit 160, the inter prediction unit 180 and the intra prediction unit 185 of the encoder 100 may be identically applied to the filtering unit 240, the inter prediction unit 260 and the intra prediction unit 265 of the decoder, respectively.

In an embodiment of the present invention, the inter prediction unit 260 may determine an optimal prediction weight using a template.

Furthermore, the inter prediction unit 260 may obtain an optimal predictor using an adaptive weight parameter in a generalized bi-prediction.

Furthermore, the inter prediction unit 260 may obtain an optimal predictor using a weight index extracted from a video signal.

Furthermore, the inter prediction unit 260 may determine a template region based on neighboring pixels of an L0 reference block, an L1 reference block and a current block. Alternatively, the template region may have already been determined in the encoder and/or the decoder.

Furthermore, the inter prediction unit 260 may determine an uncorrelation factor indicating an uncorrelation between an L0 reference block and a current block and an uncorrelation between an L1 reference block and the current block, based on the templates of the L0 reference block, the L1 reference block and the current block.

Furthermore, the inter prediction unit 260 may adaptively determine a weight parameter using an uncorrelation factor.

A reconstructed video signal output through the decoder 200 may be played back through a playback device.

FIG. 3 is a diagram illustrating a split structure of a coding unit according to an embodiment of the present invention.

The encoder may split one video (or picture) into coding tree units (CTUs) of a quadrangle form. The encoder sequentially encodes the CTUs, one by one, in raster scan order.

For example, a size of the CTU may be determined as any one of 64×64, 32×32, and 16×16, but the present invention is not limited thereto. The encoder may select and use the size of the CTU according to the resolution or a characteristic of the input image. The CTU may include a coding tree block (CTB) of a luma component and coding tree blocks (CTBs) of the two chroma components corresponding thereto.

One CTU may be decomposed in a quadtree (hereinafter referred to as a ‘QT’) structure. For example, one CTU may be split into four units, each having a square form with a side length reduced by half. Decomposition of such a QT structure may be performed recursively.

Referring to FIG. 3, a root node of the QT may be related to the CTU. The QT may be split until a leaf node is reached, and in this case, the leaf node may be referred to as a coding unit (CU).

The CU may mean a basic unit of a processing process of the input image, for example, coding in which intra/inter prediction is performed. The CU may include a coding block (CB) of a luma component and CBs of the two chroma components corresponding thereto. For example, a size of the CU may be determined as any one of 64×64, 32×32, 16×16, and 8×8, but the present invention is not limited thereto; when the video is high-resolution video, the size of the CU may further increase or may be of various sizes.

Referring to FIG. 3, the CTU corresponds to a root node and has the smallest depth (i.e., level 0) value. The CTU may not be split, depending on a characteristic of the input image, and in this case, the CTU corresponds to a CU.

The CTU may be decomposed in a QT form, and thus subordinate nodes having a depth of level 1 may be generated. Among the subordinate nodes having a depth of level 1, a node (i.e., a leaf node) that is no longer split corresponds to a CU. For example, as shown in FIG. 3B, CU(a), CU(b), and CU(j) corresponding to nodes a, b, and j are split one time in the CTU and have a depth of level 1.

At least one of the nodes having a depth of level 1 may be again split in a QT form. Among the subordinate nodes having a depth of level 2, a node (i.e., a leaf node) that is no longer split corresponds to a CU. For example, as shown in FIG. 3B, CU(c), CU(h), and CU(i) corresponding to nodes c, h, and i are split twice in the CTU and have a depth of level 2.

Further, at least one of the nodes having a depth of level 2 may be again split in a QT form. Among the subordinate nodes having a depth of level 3, a node (i.e., a leaf node) that is no longer split corresponds to a CU. For example, as shown in FIG. 3B, CU(d), CU(e), CU(f), and CU(g) corresponding to nodes d, e, f, and g are split three times in the CTU and have a depth of level 3.

The encoder may determine a maximum size or a minimum size of the CU according to a characteristic (e.g., a resolution) of the video or in consideration of encoding efficiency. Information on this, or information from which it can be derived, may be included in a bit stream. A CU having a maximum size may be referred to as a largest coding unit (LCU), and a CU having a minimum size may be referred to as a smallest coding unit (SCU).

Further, the CU having a tree structure may be hierarchically split with predetermined maximum depth information (or maximum level information). Each split CU may have depth information. Because depth information represents the number of splits and/or the level of the CU, the depth information may include information about the size of the CU.

Because the LCU is split in a QT form, the size of the SCU may be obtained using the size of the LCU and the maximum depth information. Conversely, the size of the LCU may be obtained using the size of the SCU and the maximum depth information of the tree.

For one CU, information representing whether the corresponding CU is split may be transferred to the decoder. For example, the information may be defined as a split flag and may be represented with “split_cu_flag”. The split flag may be included in all CUs except the SCU. For example, when the value of the split flag is ‘1’, the corresponding CU is again split into four CUs, and when the value of the split flag is ‘0’, the corresponding CU is no longer split and the coding process for the corresponding CU may be performed, as sketched below.
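The following is a minimal sketch of how such split flags drive a recursive quadtree parse, assuming a hypothetical read_flag() bit source and an SCU size of 8; it is illustrative and not the normative parsing process.

    MIN_CU_SIZE = 8  # assumed SCU size, for illustration

    def parse_cu_tree(read_flag, x, y, size, cus):
        # Recursively consume split_cu_flag bits and collect leaf CUs.
        # The split flag is present for every CU except the SCU.
        if size > MIN_CU_SIZE and read_flag():
            half = size // 2
            for dy in (0, half):
                for dx in (0, half):
                    parse_cu_tree(read_flag, x + dx, y + dy, half, cus)
        else:
            cus.append((x, y, size))  # leaf node: a CU to be coded

    # usage: flags are consumed in depth-first order (1 = split, 0 = leaf)
    flags = iter([1, 0, 1, 0, 0, 0, 0, 0, 0])
    leaves = []
    parse_cu_tree(lambda: next(flags), 0, 0, 64, leaves)
    print(leaves)  # 7 leaf CUs of sizes 32 and 16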

In the embodiment of FIG. 3, a split process of the CU is exemplified, but the above-described QT structure may also be applied to a split process of a transform unit (TU), which is a basic unit that performs transform.

The TU may be hierarchically split in a QT structure from a CU to be coded. For example, the CU may correspond to a root node of a tree of the transform unit (TU).

Because the TU is split in a QT structure, the TU split from the CU may be again split into smaller subordinate TUs. For example, a size of the TU may be determined as any one of 32×32, 16×16, 8×8, and 4×4, but the present invention is not limited thereto; for high-resolution video, the size of the TU may increase or may be of various sizes.

For one TU, information representing whether the corresponding TU is split may be transferred to the decoder. For example, the information may be defined as a split transform flag and may be represented with a “split_transform_flag”.

The split transform flag may be included in all TUs, except for a TU of the minimum size. For example, when the value of the split transform flag is ‘1’, the corresponding TU is again split into four TUs, and when the value of the split transform flag is ‘0’, the corresponding TU is no longer split.

As described above, the CU is a basic unit of coding that performs intra prediction or inter prediction. In order to code the input image more effectively, the CU may be split into prediction units (PUs).

A PU is a basic unit that generates a prediction block, and a prediction block may be generated differently in a PU unit even within one CU. The PU may be split differently according to whether an intra prediction mode or an inter prediction mode is used as the coding mode of the CU to which the PU belongs.

FIG. 4 is an embodiment to which the present invention is applied and is a diagram for illustrating a prediction unit.

A PU is differently partitioned depending on whether an intra-prediction mode or an inter-prediction mode is used as the coding mode of a CU to which the PU belongs.

FIG. 4(a) illustrates a PU in the case where the intra-prediction mode is used as the coding mode of a CU to which the PU belongs, and FIG. 4(b) illustrates a PU in the case where the inter-prediction mode is used as the coding mode of a CU to which the PU belongs.

Referring to FIG. 4(a), assuming the case where the size of one CU is 2N×2N (N=4, 8, 16 or 32), one CU may be partitioned into two types of PU (i.e., 2N×2N and N×N).

In this case, if one CU is partitioned as a PU of the 2N×2N form, this means that only one PU is present within one CU.

In contrast, if one CU is partitioned as a PU of the N×N form, one CU is partitioned into four PUs and a different prediction block is generated for each PU. In this case, the partition of the PU may be performed only if the size of a CB for the luma component of a CU is a minimum size (i.e., if the CU is an SCU).

Referring to FIG. 4(b), assuming that the size of one CU is 2N×2N (N=4, 8, 16 or 32), one CU may be partitioned into eight PU types (i.e., 2N×2N, N×N, 2N×N, N×2N, nL×2N, nR×2N, 2N×nU and 2N×nD).

As in intra-prediction, the PU partition of the N×N form may be performed only if the size of a CB for the luma component of a CU is a minimum size (i.e., if the CU is an SCU).

In inter-prediction, the PU partition of the 2N×N form, in which a PU is partitioned in a transverse direction, and the PU partition of the N×2N form, in which a PU is partitioned in a longitudinal direction, are supported.

Furthermore, the PU partitions of the nL×2N, nR×2N, 2N×nU and 2N×nD forms, that is, asymmetric motion partition (AMP) forms, are supported. In this case, ‘n’ means a ¼ value of 2N. However, the AMP cannot be used if the CU to which a PU belongs is a CU of the minimum size.

In order to efficiently code an input image within one CTU, an optimum partition structure of a coding unit (CU), a prediction unit (PU) and a transform unit (TU) may be determined based on a minimum rate-distortion value through the following execution process. For example, an optimum CU partition process within a 64×64 CTU is described. A rate-distortion cost may be calculated through a partition process from a CU of a 64×64 size to a CU of an 8×8 size, and a detailed process thereof is as follows (a sketch of the comparison rule appears after this list).

1) A partition structure of an optimum PU and TU which generates a minimum rate-distortion value is determined by performing inter/intra-prediction, transform/quantization, inverse quantization/inverse transform and entropy encoding on a CU of a 64×64 size.

2) The 64×64 CU is partitioned into four CUs of a 32×32 size, and an optimum partition structure of a PU and a TU which generates a minimum rate-distortion value for each of the 32×32 CUs is determined.

3) The 32×32 CU is partitioned into four CUs of a 16×16 size again, and an optimum partition structure of a PU and a TU which generates a minimum rate-distortion value for each of the 16×16 CUs is determined.

4) The 16×16 CU is partitioned into four CUs of an 8×8 size again, and an optimum partition structure of a PU and a TU which generates a minimum rate-distortion value for each of the 8×8 CUs is determined.

5) An optimum partition structure of a CU within a 16×16 block is determined by comparing the rate-distortion value of a 16×16 CU calculated in process 3) with the sum of the rate-distortion values of the four 8×8 CUs calculated in process 4). This process is performed on the remaining three 16×16 CUs in the same manner.

6) An optimum partition structure of a CU within a 32×32 block is determined by comparing the rate-distortion value of a 32×32 CU calculated in process 2) with the sum of the rate-distortion values of the four 16×16 CUs calculated in process 5). This process is performed on the remaining three 32×32 CUs in the same manner.

7) Finally, an optimum partition structure of a CU within a 64×64 block is determined by comparing the rate-distortion value of the 64×64 CU calculated in process 1) with the sum of the rate-distortion values of the four 32×32 CUs obtained in process 6).
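The comparisons in processes 5) to 7) amount to the recursion sketched below; rd_cost_as_single_cu() is a hypothetical stand-in for the full PU/TU search of process 1), and the toy cost function exists only so the example runs.

    def split4(block, half):
        # Split an NxN block (a list of rows) into its four half-size quadrants.
        return [[row[x:x + half] for row in block[y:y + half]]
                for y in (0, half) for x in (0, half)]

    def best_partition(block, size, rd_cost_as_single_cu, min_size=8):
        # Return (cost, tree) minimizing the rate-distortion cost, where the
        # tree is either a size (no split) or a list of four child trees.
        cost_here = rd_cost_as_single_cu(block, size)
        if size <= min_size:
            return cost_here, size  # 8x8 CU: cannot be split further
        half = size // 2
        children = [best_partition(sub, half, rd_cost_as_single_cu, min_size)
                    for sub in split4(block, half)]
        cost_split = sum(c for c, _ in children)
        # keep whichever of "no split" / "split into four" is cheaper
        if cost_here <= cost_split:
            return cost_here, size
        return cost_split, [t for _, t in children]

    blk = [[0.0] * 64 for _ in range(64)]  # toy 64x64 CTU
    print(best_partition(blk, 64, lambda b, s: 0.01 * s * s + 16))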

In the intra-prediction mode, a prediction mode is selected in a PU unit, and actual prediction and reconstruction are performed in a TU unit with respect to the selected prediction mode.

The TU means a basic unit by which actual prediction and reconstruction are performed. The TU includes a transform block (TB) for a luma component and TBs for the two chroma components corresponding thereto.

In the example of FIG. 3, as in the case where one CTU is partitioned in a quadtree structure to generate CUs, a TU is hierarchically partitioned in a quadtree structure from one CU to be coded.

The TU is partitioned in a quadtree structure, and thus a TU partitioned from a CU may be partitioned into smaller lower TUs. In HEVC, the size of the TU may be determined to be any one of 32×32, 16×16, 8×8 and 4×4.

Referring back to FIG. 3, it is assumed that the root node of a quadtree is related to a CU. The quadtree is partitioned until a leaf node is reached, and the leaf node corresponds to a TU.

More specifically, a CU corresponds to a root node and has the smallest depth (i.e., depth=0) value. The CU may not be partitioned, depending on the characteristics of an input image. In this case, a CU corresponds to a TU.

The CU may be partitioned in a quadtree form. As a result, lower nodes of a depth 1 (depth=1) are generated. Furthermore, a node (i.e., leaf node) that belongs to the lower nodes having the depth of 1 and that is no longer partitioned corresponds to a TU. For example, in FIG. 3(b), a TU(a), a TU(b) and a TU(j) corresponding to nodes a, b and j, respectively, have been once partitioned from the CU, and have the depth of 1.

At least any one of the nodes having the depth of 1 may be partitioned in a quadtree form again. As a result, lower nodes having a depth 2 (i.e., depth=2) are generated. Furthermore, a node (i.e., leaf node) that belongs to the lower nodes having the depth of 2 and that is no longer partitioned corresponds to a TU. For example, in FIG. 3(b), a TU(c), a TU(h) and a TU(i) corresponding to nodes c, h and i, respectively, have been twice partitioned from the CU, and have the depth of 2.

Furthermore, at least any one of the nodes having the depth of 2 may be partitioned in a quadtree form again. As a result, lower nodes having a depth 3 (i.e., depth=3) are generated. Furthermore, a node (i.e., leaf node) that belongs to the lower nodes having the depth of 3 and that is no longer partitioned corresponds to a TU. For example, in FIG. 3(b), a TU(d), a TU(e), a TU(f) and a TU(g) corresponding to nodes d, e, f and g, respectively, have been partitioned three times from the CU, and have the depth of 3.

A TU having a tree structure has predetermined maximum depth information (or the greatest level information) and may be hierarchically partitioned. Furthermore, each partitioned TU may have depth information. The depth information may include information about the size of the TU because it indicates the number of partitions and/or degree of the TU.

Regarding one TU, information (e.g., a partition TU flag “split_transform_flag”) indicating whether the corresponding TU is partitioned may be transferred to the decoder. The partition information is included in all TUs other than a TU of the minimum size. For example, if the value of the flag indicating whether the corresponding TU is partitioned is ‘1’, the corresponding TU is partitioned into four TUs again. If the value of the flag is ‘0’, the corresponding TU is no longer partitioned.

FIG. 5 is an embodiment to which the present invention is applied and is a diagram for illustrating a quadtree binarytree (hereinafter referred to as a “QTBT”) block split structure.

Quad-Tree Binary-Tree (QTBT)

A QTBT refers to a structure of a coding block in which a quadtree structure and a binarytree structure have been combined. Specifically, in a QTBT block split structure, an image is coded in a CTU unit. A CTU is split in a quadtree form, and a leaf node of the quadtree is additionally split in a binarytree form.

Hereinafter, a QTBT structure and a split flag syntax supporting the same are described with reference to FIG. 5.

Referring to FIG. 5, a current block may be split in a QTBT structure. That is, a CTU may first be hierarchically split in a quadtree form. Furthermore, a leaf node of the quadtree that is no longer split in a quadtree form may be hierarchically split in a binarytree form.

The encoder may signal a split flag in order to determine whether to split a quadtree in the QTBT structure. In this case, the quadtree split may be adjusted (or limited) by a MinQTLumaISlice, MinQTChromaISlice or MinQTNonISlice value. In this case, MinQTLumaISlice indicates a minimum size of a quadtree leaf node of a luma component in an I-slice. MinQTChromaISlice indicates a minimum size of a quadtree leaf node of a chroma component in an I-slice. MinQTNonISlice indicates a minimum size of a quadtree leaf node in a non-I-slice.

In the quadtree structure of a QTBT, a luma component and a chroma component may have independent split structures in an I-slice. For example, in the case of an I-slice in the QTBT structure, the split structures of the luma component and the chroma component may be determined differently. In order to support such a split structure, MinQTLumaISlice and MinQTChromaISlice may have different values.

For another example, in a non-I-slice of a QTBT, the split structures of the luma component and the chroma component in the quadtree structure may be determined identically. For example, in the case of a non-I-slice, the quadtree split structures of the luma component and the chroma component may be adjusted by a MinQTNonISlice value.

In the QTBT structure, a leaf node of the quadtree may be split in a binarytree form. In this case, the binarytree split may be adjusted (or limited) by MaxBTDepth, MaxBTDepthISliceL and MaxBTDepthISliceC. In this case, MaxBTDepth indicates a maximum depth of the binarytree split based on a leaf node of the quadtree in a non-I-slice, MaxBTDepthISliceL indicates a maximum depth of the binarytree split of a luma component in an I-slice, and MaxBTDepthISliceC indicates a maximum depth of the binarytree split of a chroma component in the I-slice.

Furthermore, in the I-slice of a QTBT, MaxBTDepthISliceL and MaxBTDepthISliceC may have different values because a luma component and a chroma component may have different split structures.

In the case of the split structure of a QTBT, the quadtree structure and the binarytree structure may be used together. In this case, the following rules may be applied (a sketch encoding these rules as checks appears after the list).

First, MaxBTSize is smaller than or equal to MaxQTSize. In this case, MaxBTSize indicates a maximum size of a binarytree split, and MaxQTSize indicates a maximum size of a quadtree split.

Second, a leaf node of a QT becomes the root of a BT.

Third, once a split into a BT is performed, it cannot be split into a QT again.

Fourth, a BT defines a vertical split and a horizontal split.

Fifth, MaxQTDepth and MaxBTDepth are defined in advance. In this case, MaxQTDepth indicates a maximum depth of a quadtree split, and MaxBTDepth indicates a maximum depth of a binarytree split.

Sixth, MaxBTSize and MinQTSize may be different depending on the slice type.
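A minimal sketch that encodes some of the rules above as checks; the configuration values and helper names are hypothetical and for illustration only.

    # hypothetical QTBT configuration, for illustration only
    cfg = dict(MaxQTSize=128, MaxBTSize=64, MinQTSize=8,
               MaxQTDepth=4, MaxBTDepth=3)

    assert cfg['MaxBTSize'] <= cfg['MaxQTSize']  # first rule

    def can_qt_split(size, in_bt):
        # third rule: once a BT split is made, no QT split may follow
        return (not in_bt) and size // 2 >= cfg['MinQTSize']

    def can_bt_split(size, bt_depth):
        # first and fifth rules: BT size and depth are bounded
        return size <= cfg['MaxBTSize'] and bt_depth < cfg['MaxBTDepth']

    print(can_qt_split(16, in_bt=False), can_bt_split(32, bt_depth=1))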

FIG. 6 is an embodiment to which the present invention is applied and is a diagram for illustrating a uni-directional inter prediction and a bi-directional inter prediction.

An inter prediction may be divided into a uni-directional prediction, in which only one past picture or future picture is used as a reference picture on a time axis with respect to one block, and a bi-directional prediction, in which reference is made to the past and future pictures at the same time.

FIG. 6(a) shows a uni-directional prediction, and FIG. 6(b) shows a bi-directional prediction.

Referring to FIG. 6(a), it may be seen that a current picture is present at a T0 time and refers to a picture at a (T−2) time for inter prediction. Furthermore, referring to FIG. 6(b), it may be seen that a current picture is present at a T0 time and refers to two pictures, that is, a picture at a (T−2) time and a picture at a T1 time, for inter prediction.

Furthermore, uni-directional prediction may be divided into forward direction prediction, using one reference picture displayed (or output) temporally prior to a current picture, and backward direction prediction, using one reference picture displayed (or output) temporally after a current picture.

In an inter prediction process (i.e., uni-directional or bi-directional prediction), a motion parameter (or information) used to specify which reference region (or reference block) is used in predicting a current block includes an inter prediction mode (in this case, the inter prediction mode may indicate a reference direction (i.e., uni-directional or bi-directional) and a reference list (i.e., L0, L1 or bi-direction)), a reference index (or reference picture index or reference list index), and motion vector information. The motion vector information may include a motion vector, a motion vector prediction (MVP) or a motion vector difference (MVD). The motion vector difference means the difference between a motion vector and a motion vector prediction.

In uni-directional prediction, a motion parameter for one direction is used. That is, one motion parameter may be necessary to specify a reference region (or reference block).

In a bi-directional prediction, motion parameters for both directions are used. In the bi-directional prediction method, a maximum of two reference regions may be used. The two reference regions may be present in the same reference picture or may be present in different pictures. That is, in the bi-directional prediction method, a maximum of two motion parameters may be used. Two motion vectors may have the same reference picture index or may have different reference picture indices. In this case, both the reference pictures may be displayed (or output) temporally prior to a current picture, or may be displayed (or output) after a current picture.

The encoder performs motion estimation for finding the reference region most similar to a current processing block from reference pictures in an inter prediction process. Furthermore, the encoder may provide the decoder with a motion parameter for the reference region.

The encoder or the decoder may obtain the reference region of the current processing block using the motion parameter. The reference region is present in a reference picture having the reference index. Furthermore, a pixel value or interpolated value of the reference region specified by a motion vector may be used as the predictor of the current processing block. That is, motion compensation for predicting an image of the current processing block from a previously decoded picture using motion information is performed.

In order to reduce the amount of transmission related to motion vector information, a method of obtaining a motion vector prediction (mvp) using motion information of previously coded blocks and transmitting only the difference (mvd) therefrom may be used. That is, the decoder obtains a motion vector prediction of a current processing block using motion information of other decoded blocks and obtains a motion vector value for the current processing block using the difference transmitted by the encoder. In obtaining the motion vector prediction, the decoder may obtain various motion vector candidate values using motion information of other, already decoded blocks, and may obtain one of the motion vector candidate values as the motion vector prediction.
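A minimal sketch of this mv = mvp + mvd reconstruction; the component-wise median predictor below is a hypothetical stand-in for the codec's actual candidate derivation.

    def median(a, b, c):
        return sorted([a, b, c])[1]

    def reconstruct_mv(mvd, neighbor_mvs):
        # mv = mvp + mvd, where mvp is derived from already decoded blocks;
        # a component-wise median of three neighbor MVs is assumed here.
        mvp_x = median(*(mv[0] for mv in neighbor_mvs))
        mvp_y = median(*(mv[1] for mv in neighbor_mvs))
        return (mvp_x + mvd[0], mvp_y + mvd[1])

    print(reconstruct_mv((1, -2), [(4, 0), (2, 3), (3, 1)]))  # -> (4, -1)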

FIG. 7 is an embodiment to which the present invention is applied and is a diagram for illustrating a generalized bi-prediction method.

Generalized Bi-Prediction

The present invention provides a generalized bi-prediction method for obtaining a bi-directional predictor in inter coding.

Referring to FIG. 7, it is assumed that an L0 reference picture and an L1 reference picture are present with reference to a current picture. In this case, assuming that a predictor obtained from an L0 reference block is an L0 predictor and a predictor obtained from an L1 reference block is an L1 predictor, the present invention may obtain an optimal predictor by adaptively applying a weight to the L0 predictor and the L1 predictor.

In an embodiment, as in Equation 1, a bi-directional predictor may be obtained using an adaptive weight.

P[x]=(1−w)*P₀[x+v₀]+w*P₁[x+v₁]  [Equation 1]

In this case, P[x] means the predictor at the x location of a current block. P_(i)[x+v_(i)], ∀i∈{0,1}, means a motion-compensated prediction block obtained using a motion vector (MV) v_(i) in a reference picture L_(i). (1−w) and w mean weight values. In this case, a set W of the weight values may be configured, as an embodiment, as in Equations 2 to 4.

w={⅜,½,⅝}  [Equation 2]

w={¼,⅜,½,⅝,¾}  [Equation 3]

w={−¼,¼,⅜,½,⅝,¾,5/4}  [Equation 4]
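A minimal sketch of Equation 1 applied to two motion-compensated prediction blocks; the arrays stand in for P₀ and P₁ already fetched at x+v₀ and x+v₁, and floating-point arithmetic (rather than the normative integer arithmetic) is an assumption.

    import numpy as np

    def generalized_bipred(p0, p1, w):
        # P[x] = (1 - w) * P0[x + v0] + w * P1[x + v1]  (Equation 1)
        return (1.0 - w) * p0 + w * p1

    p0 = np.full((4, 4), 100.0)  # L0 motion-compensated block
    p1 = np.full((4, 4), 120.0)  # L1 motion-compensated block
    for w in (3 / 8, 1 / 2, 5 / 8):  # the weight set of Equation 2
        print(w, generalized_bipred(p0, p1, w)[0, 0])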

Bit allocations for the weight values of Equations 2 to 4 are shown in Tables 1 to 3, respectively.

Tables 1 to 3 show index binarization schemes for the weight values of Equations 2 to 4, respectively.

TABLE 1

  Index   Weight value   Binarization Scheme #1   Binarization Scheme #2
                         (mvd_l1_zero_flag = 0)   (mvd_l1_zero_flag = 1)
  0       ⅜              00                       00
  1       ½              1                        01
  2       ⅝              01                       1

TABLE 2

  Index   Weight value   Binarization Scheme #1   Binarization Scheme #2
                         (mvd_l1_zero_flag = 0)   (mvd_l1_zero_flag = 1)
  0       ¼              0000                     0000
  1       ⅜              001                      0001
  2       ½              1                        01
  3       ⅝              01                       1
  4       ¾              0001                     001

TABLE 3

  Index   Weight value   Binarization Scheme #1   Binarization Scheme #2
                         (mvd_l1_zero_flag = 0)   (mvd_l1_zero_flag = 1)
  0       −¼             000000                   000000
  1       ¼              00001                    000001
  2       ⅜              001                      0001
  3       ½              1                        01
  4       ⅝              01                       1
  5       ¾              0001                     001
  6       5/4            000001                   00001

In Tables 1 to 3, mvd_l1_zero_flag is determined in a slice header. When mvd_l1_zero_flag=1, the MVD value of L1 is set to 0 and only the MVD value of L0 is transmitted. When mvd_l1_zero_flag=0, the MVD values of both L0 and L1 are transmitted.
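A minimal sketch of the Table 1 index binarization as a lookup keyed by mvd_l1_zero_flag; extending it to Tables 2 and 3 is mechanical, and the function name is an assumption.

    # Table 1 codewords: index -> (scheme #1, scheme #2)
    GBI_BINS_TABLE1 = {0: ('00', '00'), 1: ('1', '01'), 2: ('01', '1')}

    def binarize_weight_index(idx, mvd_l1_zero_flag):
        return GBI_BINS_TABLE1[idx][mvd_l1_zero_flag]

    print(binarize_weight_index(1, 0))  # '1' : w = 1/2 under scheme #1
    print(binarize_weight_index(2, 1))  # '1' : w = 5/8 under scheme #2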

FIG. 8 is an embodiment to which the present invention is applied and is a flowchart for illustrating a process of determining an optimal prediction method.

The present embodiment is described on the basis of the encoder for convenience of description, but the present invention is not limited thereto and may also be performed by the decoder within the range of operations the decoder can perform.

The encoder may perform a prediction according to a skip mode or a merge mode (S810).

The encoder may perform a uni-directional prediction in an L0 direction (S820) or may perform a uni-directional prediction in an L1 direction (S830).

Furthermore, the encoder may perform a bi-directional prediction to which a weight has been applied (S840). In this case, a weight determination method or bi-directional prediction method described in this specification may be applied to the bi-directional prediction to which a weight has been applied.

Steps S810 to S840 are not limited to this sequence. The encoder may perform at least one of a skip mode, a merge mode, an L0 uni-directional prediction, an L1 uni-directional prediction or a bi-directional prediction to which a weight has been applied.

The encoder may determine an optimal predictor among the predictors calculated by the above-described prediction methods (S850). In this case, the optimal predictor may mean the value that minimizes the difference between a pixel value of the current block and the predictor, or the value that minimizes a rate-distortion (RD) cost.

Hereinafter, a weight determination method for obtaining an optimal predictor and a method of performing a bi-directional prediction to which a weight has been applied are described.

FIG. 9 is an embodiment to which the present invention is applied and is a flowchart for illustrating a process of performing an optimal bi-directional prediction using a weight index.

As described in FIG. 8, the encoder may perform at least one of a skip mode, a merge mode, an L0 uni-directional prediction, an L1 uni-directional prediction or a bi-directional prediction to which a weight has been applied, and may determine an optimal predictor among them.

In this case, for the bi-directional prediction to which a weight has been applied, the weight may be determined based on any one of Tables 1 to 3. For example, in the case of Table 3, the weight index may be set to 0 to 6. In this case, the weight may mean the value corresponding to the weight index.

An embodiment of the present invention provides a method of obtaining an optimal predictor based on a bi-directional prediction to which a weight has been applied.

Referring to FIG. 9, the encoder may first set a weight index to 0 (S910).

The encoder may check whether the weight index is smaller than N (S920). For example, if the weight indices of Table 3 are used, the N value may be 7.

When the weight index is smaller than 7, the encoder may determine the value corresponding to the weight index as the weight value (S930).

Furthermore, the encoder may apply the weight value to an L0 predictor and an L1 predictor (S940). In this case, Equation 1 may be used.

The encoder may add 1 to the weight index (S950) and perform steps S920 to S940 again.

The encoder may obtain an optimal predictor among the predictors obtained through the loop process (S960). In this case, the optimal predictor may be calculated based on the weight-applied L0 predictor and L1 predictor. The finally determined weight value may mean the value that minimizes the difference between a pixel value of the current block and the predictor, or the value that minimizes a rate-distortion (RD) cost. Furthermore, the weight index corresponding to the finally determined weight value may mean an optimal weight index.

Furthermore, if the weight indices of Table 3 are used, the encoder may repeatedly perform steps S920 to S950 while the weight index has a value of 0˜6.

The present invention is not limited thereto; if the weight indices of Table 1 or Table 2 are used, the same method may be applied.
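The loop of FIG. 9 can be sketched as follows; the sum-of-squared-differences cost is a labeled stand-in for the rate-distortion cost of step S960, not the encoder's actual cost.

    import numpy as np

    W_TABLE3 = [-1/4, 1/4, 3/8, 1/2, 5/8, 3/4, 5/4]  # Equation 4 / Table 3

    def search_best_weight(cur, p0, p1, weights=W_TABLE3):
        # Try every weight index (S910 to S950) and keep the best (S960).
        # SSD against the current block is a hypothetical stand-in for the
        # rate-distortion cost.
        best = None
        for idx, w in enumerate(weights):      # S920: idx < N
            pred = (1.0 - w) * p0 + w * p1     # S930/S940: apply Equation 1
            cost = float(np.sum((cur - pred) ** 2))
            if best is None or cost < best[0]:
                best = (cost, idx, w)
        return best  # (cost, optimal weight index, optimal weight value)

    cur = np.full((4, 4), 112.0)
    p0, p1 = np.full((4, 4), 100.0), np.full((4, 4), 120.0)
    print(search_best_weight(cur, p0, p1))  # picks w = 5/8 here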

FIG. 10 is an embodiment to which the present invention is applied and shows a syntax structure in which an optimal bi-directional prediction is performed using a weight index.

An embodiment of the present invention provides a method of performing a bi-directional prediction using a weight index, and provides a method of defining the weight index in the decoder.

First, the decoder may confirm which prediction method is used with respect to a current prediction unit. For example, any one of an L0 uni-directional prediction, an L1 uni-directional prediction or a bi-directional prediction may be used as the prediction method.

Referring to FIG. 10, the decoder may confirm whether a bi-directional prediction is performed (S1010). For example, the decoder may confirm whether a bi-directional prediction is performed through if(inter_pred_idc[x0][y0]==PRED_BI). In this case, inter_pred_idc may indicate whether an L0 uni-directional prediction, an L1 uni-directional prediction or a bi-directional prediction is used for the current prediction unit, and PRED_BI may mean a bi-directional prediction.

If a bi-directional prediction is performed according to step S1010, the decoder may extract a weight index (S1020). In this case, the weight index may be represented as gbi_idx[x0][y0]. For example, the weight index may be defined according to the embodiments of Tables 1 to 3.

When the weight index is extracted, the decoder may derive the weight value corresponding to the weight index.

The decoder may obtain a bi-directional predictor by applying the weight value to an L0 predictor and an L1 predictor. In this case, Equation 1 may be used.

The decoder may reconstruct the current block using the bi-directional predictor.
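Putting the parsing steps together, a minimal decoder-side sketch; the weight table and the entropy-decoding call are abstracted behind hypothetical helpers.

    import numpy as np

    W_TABLE3 = [-1/4, 1/4, 3/8, 1/2, 5/8, 3/4, 5/4]

    def decode_bipred_block(parse_gbi_idx, p0, p1, residual):
        # S1010/S1020: parse gbi_idx, derive w, predict, then reconstruct.
        idx = parse_gbi_idx()            # hypothetical entropy-decoding call
        w = W_TABLE3[idx]                # weight value for the parsed index
        pred = (1.0 - w) * p0 + w * p1   # Equation 1
        return pred + residual           # reconstructed block

    p0, p1 = np.full((2, 2), 100.0), np.full((2, 2), 120.0)
    print(decode_bipred_block(lambda: 3, p0, p1, np.zeros((2, 2))))  # w = 1/2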

Meanwhile, as in Tables 1 to 3, in order to determine (or derive) a weight value (or weight parameter), the corresponding weight index needs to be transmitted, and additional bits need to be used to transmit the weight index. For example, as in Table 3, when mvd_l1_zero_flag is 0, the number of bits necessary to represent a weight index is 1 bit in the case of w=½, 2 bits in the case of w=⅝, 3 bits in the case of w=⅜, 4 bits in the case of w=¾, 5 bits in the case of w=¼, and 6 bits in the case of w=−¼ or 5/4.

Furthermore, if encoding is performed for all the weight values in order to determine the optimal weight value, coding complexity may be very great.

Furthermore, an optimal weight value can be calculated more precisely as more weight values (or weight parameters) are allowed: a limited number of 3 weight values (or weight parameters) is used in the case of Table 1, 5 in the case of Table 2, and 7 in the case of Table 3. However, the above examples have the problem that a finer weight value, such as 1/16 or 2/16, cannot be used. In summary, a bi-directional prediction using a weight table based on a weight index may have the following problems: additional bits are necessary, encoder complexity increases, and only a limited number of weight parameters can be used.

Accordingly, another embodiment of the present invention proposes a template-based weight determination method in order to solve these problems.

In the present embodiment, an optimal weight value may be determined without using additional bits. Furthermore, when compared with a bi-directional prediction using a weight table based on a weight index, encoder complexity is lower because only a smaller number of cases needs to be taken into consideration. Furthermore, in the present invention, the weight value (or weight parameter) is not limited to a table because it is determined based on a template. Accordingly, a more accurate expression is possible.

This is described below more specifically.

FIG. 11 is an embodiment to which the present invention is applied and is a diagram for illustrating a method of performing an adaptive bi-directional prediction using a template.

Adaptive Bi-Directional Prediction

The present invention provides a method of determining an optimal weight value using a template in inter coding and adaptively performing a bi-directional prediction using the optimal weight value. In this specification, a weight determined (or obtained) using a template may be called a template-based weight.

Referring to FIG. 11, it is assumed that an L0 reference picture and an L1 reference picture are present on the basis of a current picture. In this case, assuming that a predictor obtained from an L0 reference block is an L0 predictor and a predictor obtained from an L1 reference block is an L1 predictor, the present invention may obtain an optimal predictor by adaptively applying a weight to the L0 predictor and the L1 predictor.

In this case, the present invention may use a template (or template region) in order to determine a weight value (or weight parameter).

Furthermore, the present invention may use a template in order to search for an optimal L0/L1 predictor or to search for an optimal L0/L1 reference block.

In this case, the template may mean a set of pixels having a pre-defined form, and this may have already been determined in the encoder or the decoder. However, the present invention is not limited thereto. The template may be adaptively determined based on at least one of the characteristics (common parameter, prediction method, size, form, etc.) of a sequence, picture, slice, block, coding unit or prediction unit.

For example, referring to FIG. 11, the template may mean a set of L line pixels (L=1˜(N−1)) neighboring at least one of the left side, the upper side and the upper-left vertex of a current block (N×N). In this specification, the template is described based on a block, but the present invention is not limited thereto, as described above.

As a detailed example, the template may determine or calculateuncorrelation factors (ucL0, ucL1) based on a pixel value difference atthe same location between the template of an L0/L1 reference block andthe template of a current block (or the sum of absolute values of thepixel value differences). In this case, the uncorrelation factor maymean a value indicating an uncorrelation (or correlation) between areference block and a current block. Alternatively, the uncorrelationfactor may mean a value indicating an uncorrelation (or correlation)between the template of a reference block and the template of a currentblock.

When the uncorrelation factors are determined, the encoder or the decoder may determine a weight value using the factors, and may perform a bi-directional prediction based on the weight value.

In an embodiment, as in Equation 5, a bi-directional predictor may be obtained using a weight value w.

P[x] = (1−w)*P0[x+v0] + w*P1[x+v1]  [Equation 5]

In this case, P[x] means a predictor at the x location of a current block. Pi[x+vi], ∀i∈{0,1}, means a motion-compensated prediction block obtained using a motion vector (MV) vi in a reference picture Li. (1−w) and w mean weight values. In this case, the weight value w may be determined based on the uncorrelation factors. This is described more specifically in an embodiment below.
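As a reference, the following Python sketch shows the weighted combination of Equation 5. It assumes that p0 and p1 are motion-compensated prediction blocks already fetched from the L0 and L1 reference pictures (the motion compensation itself is outside the scope of this sketch).

    import numpy as np

    def bi_predictor(p0, p1, w):
        # Equation 5: P[x] = (1 - w) * P0[x + v0] + w * P1[x + v1]
        # p0, p1: motion-compensated blocks as numpy arrays of equal shape
        # w: weight applied to the L1 predictor
        return (1.0 - w) * p0 + w * p1

With w = 0.5 this reduces to the conventional average of the two predictors.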

FIG. 12 is an embodiment to which the present invention is applied and is a flowchart for illustrating a process of performing an adaptive bi-directional prediction using a template.

The encoder or the decoder may determine the template or template region of at least one of an L0 reference block, an L1 reference block and a current block (S1210). In this case, the template may mean a set of pixels having a pre-defined form, and this may have already been determined in the encoder or the decoder. For example, the template may mean a set of L line pixels (L = 1 to N−1) neighboring at least one of the left side, the upper side and the upper-left vertex of a current block (N×N) or a reference block (N×N). However, the present invention is not limited thereto; the template may be adaptively determined based on at least one of the characteristics (common parameter, prediction method, size, form, etc.) of a sequence, picture, slice, block, coding unit or prediction unit.

The encoder or the decoder may determine uncorrelation factors (ucL0, ucL1) based on the template region (S1220). In this case, the uncorrelation factors may include an L0 uncorrelation factor (ucL0) for the L0 reference block and an L1 uncorrelation factor (ucL1) for the L1 reference block.

The encoder or the decoder may calculate or determine a weight value using the uncorrelation factors (ucL0, ucL1) (S1230).

FIG. 13 is an embodiment to which the present invention is applied and is a diagram illustrating a method of determining a template region in order to perform an adaptive bi-directional prediction.

Method of Determining Template Region

The present invention provides various methods of determining a template or template region as in step S1210 of FIG. 12.

First, referring to FIG. 13(a), a template Template_X may mean a set of L line pixels (L = 1 to N−1) neighboring at least one of the left side, the upper side and the upper-left vertex of a block (N×N).

For another example, referring to FIG. 13(b), a template Template_Y may mean a set of L line pixels (L = 1 to N−1) neighboring at least one of the left, right, upper and lower sides and the four vertexes of a block (N×N). That is, the template Template_Y may be configured with pixels of a form surrounding the block. In this case, a pixel value neighboring the right or lower side of the block (which may include a vertex value) may be determined from a pixel value neighboring at least one of the left or upper side of the block (which may include a vertex value), but the present invention is not limited thereto.

In the present invention, the above two cases have been taken as an example, but the present invention is not limited thereto. For example, a template may be configured with pixels neighboring at least one of the left, right, upper and lower sides or the four vertexes of a block (N×N).
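As an illustration, the following Python sketch enumerates the pixel coordinates of a Template_X-type region (FIG. 13(a)). The exact layout, for example whether the upper lines extend over the upper-left corner, is an assumption made for this sketch rather than a constraint of the invention.

    def template_x_coords(x0, y0, n, l=1):
        # Coordinates of an l-line template (1 <= l <= n-1) around an
        # n x n block whose top-left sample is at (x0, y0): l rows above
        # the block (extended over the upper-left corner) and l columns
        # to the left of the block.
        coords = []
        for dy in range(1, l + 1):           # upper lines, including the corner
            for dx in range(-l, n):
                coords.append((x0 + dx, y0 - dy))
        for dx in range(1, l + 1):           # left columns
            for dy in range(n):
                coords.append((x0 - dx, y0 + dy))
        return coords

The same coordinate offsets, applied at the positions of the L0 and L1 reference blocks, yield the reference-block templates used in the following sections.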

FIG. 14 is an embodiment to which the present invention is applied and is a flowchart for illustrating a process of performing an adaptive bi-directional prediction based on a template without using a weight index.

Method of Determining Uncorrelation Factors

The present invention provides a method of determining uncorrelation factors based on a template region.

In an embodiment, the uncorrelation factors may be determined or calculated by Equation 6 and Equation 7.

ucL0 = Σ_(i∈R) f(c_(i), p0_(i))  [Equation 6]

ucL1 = Σ_(i∈R) f(p1_(i), c_(i))  [Equation 7]

In this case, ucL0 indicates an L0 uncorrelation factor indicating an uncorrelation between an L0 reference block and a current block. ucL1 indicates an L1 uncorrelation factor indicating an uncorrelation between an L1 reference block and a current block.

R indicates a template region. p0_(i) indicates a pixel value of an L0 template region. p1_(i) indicates a pixel value of an L1 template region. c_(i) indicates a pixel value of a current template. Furthermore, f(x,y) = abs(x−y).

Furthermore, the present invention is not limited thereto, and the uncorrelation factors may be replaced according to various methods of obtaining a correlation or uncorrelation.
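Since f(x,y) = abs(x−y), Equations 6 and 7 amount to the sum of absolute differences (SAD) between co-located template pixels. A minimal Python sketch, assuming the template pixels have already been gathered into flat lists in the same scan order:

    def uncorrelation_factor(cur_tpl, ref_tpl):
        # Equations 6/7: sum over the template region R of
        # f(c_i, p_i) with f(x, y) = abs(x - y)
        return sum(abs(int(c) - int(p)) for c, p in zip(cur_tpl, ref_tpl))

    # Hypothetical inputs: co-located template pixels of the current
    # block and of the L0/L1 reference blocks.
    # ucL0 = uncorrelation_factor(cur_tpl, l0_tpl)
    # ucL1 = uncorrelation_factor(cur_tpl, l1_tpl)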

Method of Determining Weight Parameter

The present invention provides a method of determining a weight parameter using uncorrelation factors. In this specification, a weight parameter and a weight value may be used interchangeably.

In an embodiment, a weight parameter w may be determined or calculated by Equation 8.

w = ucL1/(ucL0 + ucL1)  [Equation 8]
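A direct Python rendering of Equation 8 follows. The fallback to w = 1/2 when both factors are zero (both templates match exactly) is an assumption added for this sketch; the document does not specify that case.

    def template_weight(uc_l0, uc_l1):
        # Equation 8: w = ucL1 / (ucL0 + ucL1)
        total = uc_l0 + uc_l1
        return 0.5 if total == 0 else uc_l1 / total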

The present invention provides an adaptive bi-directional prediction method, which may be performed as follows.

The encoder or the decoder may determine a template region (S1410). In this case, the embodiments described in this specification may be applied to the template region.

The encoder or the decoder may determine uncorrelation factors based on the template region (S1420). In this case, the embodiments described in this specification may be applied to the uncorrelation factors.

The encoder or the decoder may determine a weight parameter using the uncorrelation factors (S1430). In this case, the embodiments described in this specification may be applied to the weight parameter.

The encoder or the decoder may obtain a bi-directional predictor using the weight parameter (S1440).

The present invention can obtain an optimal predictor without transmitting a separate index indicating a weight parameter, by deriving the weight parameter based on at least one of a template region or uncorrelation factors as described above.
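Putting the pieces together, the following is a minimal sketch of steps S1410 to S1440, reusing the bi_predictor, uncorrelation_factor and template_weight helpers sketched above. Gathering the template pixels (S1410) is assumed to have been done by the caller.

    def adaptive_bi_prediction(cur_tpl, l0_tpl, l1_tpl, p0, p1):
        uc_l0 = uncorrelation_factor(cur_tpl, l0_tpl)   # S1420
        uc_l1 = uncorrelation_factor(cur_tpl, l1_tpl)   # S1420
        w = template_weight(uc_l0, uc_l1)               # S1430
        return bi_predictor(p0, p1, w)                  # S1440

Note that the same steps run identically in the encoder and the decoder, which is what makes the weight derivable without signaling.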

Meanwhile, for ease of hardware design in encoding and decoding, the weight should in practice be representable with a power-of-two denominator so that the division can be implemented by a shift operation. Accordingly, Equation 8 may be redefined as in Equation 9 and Equation 10.

t = ucL1/(ucL0 + ucL1)  [Equation 9]

w = round(step*t)/step  [Equation 10]

In this case, t is a real value between 0 and 1. Assuming that step = 4, w has one of the values (0, ¼, 2/4, ¾, 1). The step is a given number and may be, for example, 4, 8, 16, 32 or 64.
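A Python sketch of the quantization of Equations 9 and 10; as in the previous sketch, the zero-denominator fallback is an added assumption:

    def quantized_weight(uc_l0, uc_l1, step=4):
        # Equation 9: t = ucL1 / (ucL0 + ucL1)
        total = uc_l0 + uc_l1
        t = 0.5 if total == 0 else uc_l1 / total
        # Equation 10: snap t to the grid {0, 1/step, ..., 1}; with a
        # power-of-two step the division maps to a hardware shift
        return round(step * t) / step

With step = 4 this yields exactly the values 0, 1/4, 2/4, 3/4 and 1 mentioned above.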

FIG. 15 is an embodiment to which the present invention is applied and is a flowchart for illustrating a process of performing an optimal bi-directional prediction based on a method of determining two or more weights.

The present invention provides a method of performing an optimal bi-directional prediction based on a method of determining two or more weights. In this case, if the weight determination methods differ, a weight parameter may be divided into a first weight parameter and a second weight parameter, and a predictor may be divided into a first predictor and a second predictor, in order to distinguish between the terms. The present invention is not limited thereto. Where the same term is used, the modifiers first and second may be used to distinguish instances, and this may also be applied to other terms.

The encoder or the decoder may determine a template region (S1510). In this case, the embodiments described in this specification may be applied to the template region.

The encoder or the decoder may determine uncorrelation factors based on the template region (S1520). In this case, the embodiments described in this specification may be applied to the uncorrelation factors.

The encoder or the decoder may determine a weight parameter using the uncorrelation factors (S1530). In this case, the embodiments described in this specification may be applied to the weight parameter.

The encoder or the decoder may obtain a first predictor using the weight parameter (S1540).

Meanwhile, the encoder or the decoder may determine the weight parameter using various methods. For example, the encoder or the decoder may determine the weight parameter as ½, that is, a predetermined value (S1550).

For another example, as described in FIGS. 7 to 10, the weight parameter may be determined based on a weight index. For example, a value corresponding to the weight index may be derived from a predetermined weight table.

The encoder or the decoder may obtain a second predictor based on the weight parameter determined in step S1550 (S1560).

The encoder or the decoder may compare the first predictor with the second predictor (S1570), and may obtain an optimal predictor (S1580).
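An encoder-side sketch of this comparison is given below. The SAD against the original block stands in for the rate-distortion cost actually contemplated in this embodiment, and the returned flag anticipates the adaptive weight index of Table 4 below (0 for the fixed weight ½, 1 for the template-based weight w).

    import numpy as np

    def choose_predictor(orig, p0, p1, w_template):
        first = bi_predictor(p0, p1, w_template)    # S1540, template-based w
        second = bi_predictor(p0, p1, 0.5)          # S1550-S1560, fixed 1/2
        cost_first = np.abs(orig - first).sum()     # distortion-only stand-in
        cost_second = np.abs(orig - second).sum()   # for the RD cost (S1570)
        if cost_first <= cost_second:
            return first, 1                         # S1580: keep w, index 1
        return second, 0                            # S1580: keep 1/2, index 0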

In another embodiment of the present invention, the encoder may assign a separate index (hereinafter referred to as an “adaptive weight index”) regarding whether to perform steps S1510 to S1540 or steps S1550 to S1560, and may explicitly transmit the index.

For example, adaptive weight indices may be defined using Table 4 below.

TABLE 4

  Index    Weight    Binarization
  0        ½         0
  1        w         1

In this case, when the adaptive weight index is 0, the weight parameter means ½. When the adaptive weight index is 1, the weight parameter means w. In this case, the above-described methods may be applied to the method of determining the w value.
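On the decoder side, the Table 4 mapping is a one-line lookup; w_template denotes the template-derived weight of Equation 8 or Equation 10:

    def weight_from_index(idx, w_template):
        # Table 4: index 0 selects the fixed weight 1/2,
        # index 1 selects the template-based weight w
        return 0.5 if idx == 0 else w_template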

The encoder may perform encoding for all adaptive weight index cases, and may determine an optimal weight parameter based on the results. Furthermore, the corresponding adaptive weight index is signaled and transmitted to the decoder.

The syntax structure of FIG. 10 may be identically used for a method of signaling the adaptive weight index.

Another embodiment of the present invention provides various methods of defining an adaptive weight index.

For example, an adaptive weight index may be defined using Table 5 or Table 6.

Referring to Table 5, when the adaptive weight index is 0, the weight parameter means ½ and is represented as the binarization bit 0. When the adaptive weight index is 1, the weight parameter means w and is represented as the binarization bits 10. When the adaptive weight index is 2, the weight parameter means w′ and is represented as the binarization bits 11. In this case, w′ may be determined as the value closest to w among values smaller than w, or as the value closest to w among values greater than w.

TABLE 5

  Index    Weight    Binarization
  0        ½         0
  1        w         10
  2        w′        11

TABLE 6

  Index    Weight    Binarization
  0        w         0
  1        ½         10
  2        w′        11
  . . .    . . .     . . .
  n − 1

Likewise, referring to Table 6, when the adaptive weight index is 0, the weight parameter means w and is represented as the binarization bit 0. When the adaptive weight index is 1, the weight parameter means ½ and is represented as the binarization bits 10. When the adaptive weight index is 2, the weight parameter means w′ and is represented as the binarization bits 11. In this case, w′ may be determined as the value closest to w among values smaller than w, or as the value closest to w among values greater than w.

The present invention may explicitly use n weight parameters by adding new weight parameters as in Table 6.
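The binarizations shown in Tables 5 and 6 are consistent with a truncated unary code, which is one plausible reading for the general n-entry case; a minimal sketch:

    def binarize_index(idx, n):
        # Truncated unary: '0', '10', '110', ..., with the last index
        # (n - 1) written without the terminating '0'.
        # For n = 3 this gives '0', '10', '11', matching Tables 5 and 6.
        return '1' * idx + ('0' if idx < n - 1 else '')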

As described above, the embodiments described in the present invention may be implemented and executed on a processor, microprocessor, controller or chip. For example, the function units shown in FIGS. 1 and 2 may be implemented and executed on a computer, processor, microprocessor, controller or chip.

Furthermore, the decoder and the encoder to which the present invention is applied may be included in a multimedia broadcasting transmission/reception apparatus, a mobile communication terminal, a home cinema video apparatus, a digital cinema video apparatus, a surveillance camera, a video chatting apparatus, a real-time communication apparatus such as video communication, a mobile streaming apparatus, a storage medium, a camcorder, a VoD service providing apparatus, an Internet streaming service providing apparatus, a three-dimensional (3D) video apparatus, a teleconference video apparatus, and a medical video apparatus, and may be used to process video signals and data signals.

Furthermore, the decoding/encoding method to which the present invention is applied may be produced in the form of a program that is to be executed by a computer and may be stored in a computer-readable recording medium. Multimedia data having a data structure according to the present invention may also be stored in computer-readable recording media. The computer-readable recording media include all types of storage devices in which data readable by a computer system is stored. The computer-readable recording media may include a BD, a USB, ROM, RAM, CD-ROM, a magnetic tape, a floppy disk, and an optical data storage device, for example. Furthermore, the computer-readable recording media include media implemented in the form of carrier waves, e.g., transmission through the Internet. Furthermore, a bit stream generated by the encoding method may be stored in a computer-readable recording medium or may be transmitted over wired/wireless communication networks.

INDUSTRIAL APPLICABILITY

The exemplary embodiments of the present invention have been disclosed for illustrative purposes, and those skilled in the art may improve, change, replace, or add various other embodiments within the technical spirit and scope of the present invention disclosed in the attached claims.

The invention claimed is:
 1. A method of performing a bi-directional prediction using a template-based weight by an apparatus, the method comprising: determining a template region for performing the bi-directional prediction on a current block, wherein the template region indicates a set of one or more L line pixels defined as pixels on L number of lines (L=1, . . . , N−1, where N is a width or a height of the current block or a reference block) neighboring at least one of left, upper and upper-left vertexes of the current block or the reference block, or indicates a set of one or more L line pixels defined as pixels on L number of lines (L=1, . . . , N−1, where N is the width or the height of the current block or the reference block) neighboring at least one of a left, right, upper side, lower side and four vertexes of the current block or the reference block; calculating an uncorrelation factor based on the template region, wherein the uncorrelation factor means a value indicating an uncorrelation between a template of the current block and a template of the reference block, and wherein the uncorrelation factor is derived by a sum of absolute values of pixel value differences between the template of the current block and the template of the reference block; determining a first weight parameter of the current block based on the uncorrelation factor; and obtaining a first predictor of the current block using the first weight parameter.
 2. The method of claim 1, wherein the uncorrelation factor includes an L0 uncorrelation factor for an L0 reference block and an L1 uncorrelation factor for an L1 reference block.
 3. The method of claim 1, wherein the first weight parameter of the current block is determined by an equation ((the L1 uncorrelation factor)/(the L0 uncorrelation factor+the L1 uncorrelation factor)).
 4. The method of claim 1, further comprising: determining a second weight parameter of the current block as a ½ value and obtaining a second predictor using the second weight parameter; and determining an optimal predictor of the current block based on a rate-distortion cost of the first predictor and the second predictor.
 5. The method of claim 4, further comprising: signaling an adaptive weight index corresponding to each weight parameter, wherein the first weight parameter and the second weight parameter are indicated by the adaptive weight index.
 6. The method of claim 4, wherein the second weight parameter is determined based on a predetermined weight table.
 7. An apparatus for performing a bi-directional prediction on a current block using a template-based weight, the apparatus comprising: a processor configured to determine a template region for performing the bi-directional prediction on the current block, wherein the template region indicates a set of one or more L line pixels defined as pixels on L number of lines (L=1, . . . , N−1, where N is a width or a height of the current block or a reference block) neighboring at least one of left, upper and upper-left vertexes of the current block or the reference block, or indicates a set of one or more L line pixels defined as pixels on L number of lines (L=1, . . . , N−1, where N is the width or the height of the current block or the reference block) neighboring at least one of a left, right, upper side, lower side and four vertexes of the current block or the reference block, calculate an uncorrelation factor based on the template region, determine a first weight parameter of the current block based on the uncorrelation factor, obtain a first predictor of the current block using the first weight parameter, and reconstruct the current block using the first predictor, wherein the uncorrelation factor means a value indicating an uncorrelation between a template of the current block and a template of the reference block, and wherein the uncorrelation factor is derived by a sum of absolute values of pixel value differences between the template of the current block and the template of the reference block; and a memory configured to store information instructing to perform an operation of the processor.
 8. The apparatus of claim 7, wherein the uncorrelation factor includes an L0 uncorrelation factor for an L0 reference block and an L1 uncorrelation factor for an L1 reference block.
 9. The apparatus of claim 7, wherein the first weight parameter of the current block is determined by an equation ((the L1 uncorrelation factor)/(the L0 uncorrelation factor+the L1 uncorrelation factor)).
 10. The apparatus of claim 7, wherein the processor is configured to: determine a second weight parameter of the current block as a ½ value and obtain a second predictor using the second weight parameter, and determine an optimal predictor of the current block based on a rate-distortion cost of the first predictor and the second predictor.
 11. The apparatus of claim 10, wherein the processor is further configured to extract, from a video signal, an adaptive weight index corresponding to each weight parameter, wherein the first weight parameter and the second weight parameter are indicated by the adaptive weight index.
 12. The apparatus of claim 10, wherein the second weight parameter is determined based on a predetermined weight table.